Univariate statistical analysis of environmental (compositional) data: problems and possibilities.
Abstract
It has been known for almost 30 years that compositional (closed) data have special geometrical properties. In the environmental sciences, where the concentrations of chemical elements in different sample materials are investigated, almost all datasets are compositional. In general, compositional data are parts of a whole and carry only relative information. Data that sum to a constant, e.g. 100 wt.% or 1,000,000 mg/kg, are the best-known example. It is widely neglected that the "closure" characteristic remains even if only one of all possible elements is measured; it is an inherent property of compositional data. No variable is free to vary independently of all the others. Existing transformations to "open" closed data are seldom applied: they are more complicated than a log transformation, and the relationship to the original data unit is lost. Results obtained with classical statistical techniques have appeared reasonable, so the possible consequences of working with closed data have rarely been questioned. Here the simple univariate case of data analysis is investigated. It can be demonstrated that data closure must be overcome before calculating even simple statistical measures such as the mean or standard deviation, or plotting graphs of the data distribution, e.g. a histogram. Some measures, such as the standard deviation (or the variance), make no statistical sense with closed data, and all statistical tests that build on the standard deviation (or variance) will thus give erroneous results if applied to the original data.
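The abstract's central point, that closure has to be overcome before even a mean is computed, can be illustrated with a small sketch. The abstract does not say which transformation is used, so the code below assumes one simple way to "open" a single variable: the log-ratio of the measured part against the remainder of the whole. The function names and the sample concentrations are hypothetical and serve only as illustration.

```python
import math
import statistics


def closure(parts, total=1.0):
    """Rescale positive parts so that they sum to `total` (e.g. 100 for wt.%)."""
    s = sum(parts)
    return [total * p / s for p in parts]


def logit_ratio(x, total=1.0):
    """Log-ratio of one part against the rest of the whole.

    Assumed transformation for illustration; not taken from the abstract.
    """
    return math.log(x / (total - x))


def back_transform(z, total=1.0):
    """Map a log-ratio value back to the original closed scale."""
    return total / (1.0 + math.exp(-z))


# Hypothetical element concentrations, expressed as fractions of the whole.
conc = [0.001, 0.002, 0.004, 0.02, 0.10]

# Classical estimate computed directly on the closed data.
arithmetic_mean = statistics.mean(conc)

# Estimate computed in log-ratio space, then expressed in the original unit.
z = [logit_ratio(x) for x in conc]
center = back_transform(statistics.mean(z))
spread_z = statistics.stdev(z)  # spread is interpreted in log-ratio space

# The log-ratio does not depend on the closure constant (fraction vs. wt.%).
z_percent = [logit_ratio(100.0 * x, total=100.0) for x in conc]

print(f"arithmetic mean of raw data: {arithmetic_mean:.4f}")
print(f"back-transformed center:     {center:.4f}")
print(f"standard deviation (log-ratio space): {spread_z:.3f}")
```

Under this assumption the log-ratio values are identical whether the data are reported as fractions or as wt.%, which is the sense in which only relative information is carried. The back-transformed center stays inside the closed interval, whereas a standard deviation computed on the raw values has no such guarantee of a meaningful interpretation.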
Similar resources
The Use of Robust Factor Analysis of Compositional Geochemical Data for the Recognition of the Target Area in Khusf 1:100000 Sheet, South Khorasan, Iran
The closed nature of geochemical data has been proven in many studies. Compositional data have special properties that mean that standard statistical methods cannot be used to analyse them. These data imply a particular geometry called Aitchison geometry in the simplex space. For analysis, the dataset must first be opened by the various transformations provided. One of the most popular of the a...
Investigating the mediating role of competitive-compositional strategies and the resource-based approach in the effect of structure on organizational performance
Today, companies facing challenging conditions succeed when sufficient knowledge and recognition of environmental challenges lead to progress and improvement in their performance. In this regard, the present study investigates the effect of structure on organizational performance, taking into account the mediating role of competitive-compositional strategies an...
Lecture Notes on Compositional Data Analysis
Preface These notes have been prepared as support to a short course on compositional data analysis. Their aim is to transmit the basic concepts and skills for simple applications, thus setting the premises for more advanced projects. One should be aware that frequent updates will be required in the near future, as the theory presented here is a field of active research. The notes are based both...
Decomposing compositional data: minimum chi-squared reduced-rank approximations on the simplex
The logratio transformation (Aitchison, 1981, 1986) opened the way to statistically rigorous analysis of compositional data. Most statistical problems in compositional data analysis can be formulated in terms of logratios, and solved accordingly. However, some problems are more easily described and solved in terms of raw compositional variables. Raw compositions with k components are restricted...
The bivariate statistical analysis of environmental (compositional) data.
Environmental sciences usually deal with compositional (closed) data. Whenever the concentration of chemical elements is measured, the data will be closed, i.e. the relevant information is contained in the ratios between the variables rather than in the data values reported for the variables. Data closure has severe consequences for statistical data analysis. Most classical statistical methods ...
Journal title:
- The Science of the Total Environment
Volume 407, Issue 23
Pages -
Publication date 2009